BERT-base
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- (5 more...)
Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training
Therefore, from a system-level perspective, the design ethos of a system-efficient communication-compression algorithm is that compression and decompression must be computationally light and fast, and the algorithm must remain friendly to efficient collective communication primitives.
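Birder's actual compressor is not reproduced in this digest, but a minimal sketch of the kind of scheme that ethos points to is shown below: sign bits plus one per-tensor scale, packed into a flat byte buffer that standard collective primitives can move directly. All names here are illustrative, not Birder's API.

```python
import numpy as np

def compress_1bit(grad: np.ndarray):
    """Illustrative 1-bit compression: sign bits plus one per-tensor scale.

    Compression is just a sign test and a bit-pack, so it is computationally
    light, and the resulting flat uint8 buffer is friendly to collective
    primitives such as all-gather.
    """
    scale = np.mean(np.abs(grad))   # single scalar, cheap to compute
    signs = (grad >= 0)             # boolean sign mask
    packed = np.packbits(signs)     # 8 signs per byte -> ~32x smaller payload
    return packed, scale, grad.shape

def decompress_1bit(packed, scale, shape):
    """Inverse: unpack sign bits and rescale to +/- scale."""
    n = int(np.prod(shape))
    signs = np.unpackbits(packed)[:n].astype(np.float32)
    return (2.0 * signs - 1.0).reshape(shape) * scale

# Round trip on a toy gradient.
g = np.random.randn(1024).astype(np.float32)
packed, s, shp = compress_1bit(g)
g_hat = decompress_1bit(packed, s, shp)
```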
- North America > Canada (0.04)
- Asia > China (0.04)
BayesQ: Uncertainty-Guided Bayesian Quantization
Lamaakal, Ismail, Yahyati, Chaymae, Maleh, Yassine, Makkaoui, Khalid El, Ouahbi, Ibrahim
We present BayesQ, an uncertainty-guided post-training quantization framework that is the first to optimize quantization under the posterior expected loss. BayesQ fits a lightweight Gaussian posterior over weights (diagonal Laplace by default; optional K-FAC/low-rank), whitens by the posterior covariance, designs codebooks to minimize posterior-expected distortion, and allocates mixed precision via a greedy knapsack that maximizes marginal expected-loss reduction per bit under a global budget. For scalar quantizers, posterior-expected MSE yields closed-form tables; task-aware proxies are handled by short Monte Carlo on a small calibration set. An optional calibration-only distillation aligns the quantized model with the posterior predictive teacher. At matched average bits/weight of 3.0/3.5/4.0, BayesQ improves over strong PTQ baselines on ResNet-50 (ImageNet) and BERT-base (GLUE), e.g., vs. GPTQ by $+1.5/+0.7/+0.3$ top-1 percentage points on RN50 and $+1.1/+0.4/+0.2$ GLUE points on BERT, while requiring one-time preprocessing comparable to a GPTQ pass. BayesQ reframes low-bit quantization as uncertainty-aware risk minimization in a practical, post-training pipeline.
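The greedy knapsack allocator is easiest to see in code. Below is a hypothetical sketch under stated assumptions: each weight group has a table of posterior-expected losses at candidate bit-widths, and upgrades are applied in order of marginal expected-loss reduction per extra bit until the global average-bits budget is hit. The function and table names are invented for illustration; this is not BayesQ's implementation.

```python
import heapq

def greedy_bit_allocation(loss_tables, sizes, avg_bits_budget):
    """Greedy knapsack sketch: loss_tables[g][b] is the posterior-expected
    loss of weight group g at bit-width b; sizes[g] is its weight count.
    Every group starts at the lowest width, and the upgrade with the
    largest loss reduction per extra bit is applied repeatedly while the
    global average bits/weight stays within budget."""
    widths = sorted(next(iter(loss_tables.values())).keys())
    alloc = {g: widths[0] for g in loss_tables}
    total_weights = sum(sizes.values())
    used_bits = sum(sizes[g] * alloc[g] for g in alloc)

    def gain_per_bit(g, b_from, b_to):
        extra = sizes[g] * (b_to - b_from)
        return (loss_tables[g][b_from] - loss_tables[g][b_to]) / extra, extra

    heap = []
    for g in alloc:
        if len(widths) > 1:
            gpb, extra = gain_per_bit(g, widths[0], widths[1])
            heapq.heappush(heap, (-gpb, g, widths[1], extra))

    while heap:
        neg_gpb, g, b_to, extra = heapq.heappop(heap)
        if alloc[g] >= b_to:  # stale entry, group already upgraded
            continue
        if (used_bits + extra) / total_weights > avg_bits_budget:
            continue          # this upgrade would bust the global budget
        alloc[g], used_bits = b_to, used_bits + extra
        nxt = widths.index(b_to) + 1
        if nxt < len(widths):  # queue the group's next possible upgrade
            gpb, e = gain_per_bit(g, b_to, widths[nxt])
            heapq.heappush(heap, (-gpb, g, widths[nxt], e))
    return alloc

# Toy usage: two weight groups, candidate widths {2, 3, 4} bits,
# a 3.0 average-bits/weight budget.
tables = {"layer1": {2: 5.0, 3: 2.0, 4: 1.5},
          "layer2": {2: 1.0, 3: 0.9, 4: 0.85}}
print(greedy_bit_allocation(tables, {"layer1": 100, "layer2": 100}, 3.0))
```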
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Lightweight Baselines for Medical Abstract Classification: DistilBERT with Cross-Entropy as a Strong Default
Liu, Jiaqi, Wang, Tong, Liu, Su, Hu, Xin, Tong, Ran, Wang, Lanruo, Xu, Jiexi
This research evaluates lightweight medical abstract classification methods to establish their maximum performance under tight budget restrictions. On the public medical abstracts corpus, we fine-tune BERT-base and DistilBERT with three objectives: cross-entropy (CE), class-weighted CE, and focal loss, under identical tokenization, sequence length, optimizer, and schedule. DistilBERT with plain CE gives the strongest raw-argmax trade-off, while post-hoc operating-point selection (validation-calibrated, classwise thresholds) substantially improves deployed performance; under this tuned regime, focal loss benefits most. We report Accuracy, Macro-F1, and Weighted-F1, release evaluation artifacts, and include confusion analyses to clarify error structure. The practical takeaway is to start with a compact encoder and CE, then add lightweight calibration or thresholding when deployment requires higher macro balance.
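A minimal sketch of what validation-calibrated, classwise thresholding can look like is given below. The paper's exact selection procedure is not specified in the abstract, so this coordinate-wise sweep, which rescales class probabilities to maximize macro-F1, is only one plausible instantiation; all names are hypothetical.

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_classwise_thresholds(probs, labels, grid=np.linspace(0.5, 2.0, 16)):
    """Post-hoc operating-point selection sketch: one multiplicative
    threshold per class, swept coordinate-wise on validation data.
    Prediction is argmax over threshold-scaled probabilities; maximizing
    macro-F1 trades a little raw accuracy for class balance."""
    n_classes = probs.shape[1]
    weights = np.ones(n_classes)

    def macro_f1(w):
        preds = np.argmax(probs * w, axis=1)
        return f1_score(labels, preds, average="macro")

    best = macro_f1(weights)
    for c in range(n_classes):  # one pass of coordinate ascent
        for t in grid:
            trial = weights.copy()
            trial[c] = t
            score = macro_f1(trial)
            if score > best:
                best, weights = score, trial
    return weights, best

# Usage: probs/labels would come from the classifier's validation split;
# random stand-ins are used here only to make the sketch runnable.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(5), size=200)
labels = rng.integers(0, 5, size=200)
w, f1 = tune_classwise_thresholds(probs, labels)
```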
- North America > United States > California > Orange County > Irvine (0.14)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Michigan (0.04)
- North America > United States > Florida > Hillsborough County > University (0.04)
- Health & Medicine > Therapeutic Area (0.68)
- Health & Medicine > Diagnostic Medicine (0.68)
FedHFT: Efficient Federated Finetuning with Heterogeneous Edge Clients
Ilhan, Fatih, Tekin, Selim Furkan, Huang, Tiansheng, Liu, Gaowen, Kompella, Ramana, Eisenhauer, Greg, Lin, Yingyan Celine, Pu, Calton, Liu, Ling
Fine-tuning pre-trained large language models (LLMs) has become common practice for personalized natural language understanding (NLU) applications on downstream tasks and domain-specific datasets. However, there are two main challenges: (i) limited and/or heterogeneous data for fine-tuning due to proprietary data confidentiality or privacy requirements, and (ii) varying computation resources available across participating clients such as edge devices. This paper presents FedHFT, an efficient and personalized federated fine-tuning framework that addresses both challenges. First, we introduce a mixture of masked adapters to handle resource heterogeneity across participating clients, enabling high-performance collaborative fine-tuning of pre-trained language model(s) across multiple clients in a distributed setting while keeping proprietary data local. Second, we introduce a bi-level optimization approach, based on masked personalization and client clustering, to handle non-IID data distributions. Extensive experiments demonstrate significant performance and efficiency improvements on various natural language understanding tasks under data and resource heterogeneity, compared to representative heterogeneous federated learning methods.
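FedHFT's mixture of masked adapters and its bi-level optimization are more involved than the abstract describes, but a minimal sketch of the core masking idea, under assumptions, appears below: each client trains only a budget-sized binary mask of a shared adapter, and the server averages each entry over the clients that cover it. All names are hypothetical, not the paper's API.

```python
import numpy as np

def make_client_mask(shape, capacity, seed):
    """Illustrative binary mask sized to a client's compute budget:
    capacity in (0, 1] is the fraction of adapter weights the client trains."""
    rng = np.random.default_rng(seed)
    return (rng.random(shape) < capacity).astype(np.float32)

def masked_update(global_adapter, client_grads, masks, lr=0.1):
    """Server-side aggregation sketch: each client contributes gradients only
    for its unmasked entries; every entry is averaged over the clients that
    cover it, so low-capacity clients never block shared-adapter updates."""
    num = sum(g * m for g, m in zip(client_grads, masks))
    den = np.maximum(sum(masks), 1.0)  # avoid divide-by-zero on uncovered entries
    return global_adapter - lr * (num / den)

# Toy round: three clients with heterogeneous capacities on a 4x4 adapter.
adapter = np.zeros((4, 4), dtype=np.float32)
caps = [0.25, 0.5, 1.0]
masks = [make_client_mask(adapter.shape, c, seed=i) for i, c in enumerate(caps)]
grads = [np.random.randn(4, 4).astype(np.float32) for _ in caps]
adapter = masked_update(adapter, grads, masks, lr=0.1)
```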
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario (0.04)
- (8 more...)
- Energy (0.46)
- Information Technology > Security & Privacy (0.34)
Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models
Solgi, Ryan, Zhen, Kai, Swaminathan, Rupak Vignesh, Susanj, Nathan, Mouchtaris, Athanasios, Kunzmann, Siegfried, Zhang, Zheng
The efficient implementation of large language models (LLMs) is crucial for deployment on resource-constrained devices. Low-rank tensor compression techniques, such as tensor-train (TT) networks, have been widely studied for over-parameterized neural networks. However, their application to compressing pre-trained LLMs for downstream tasks (post-training) remains challenging due to the high-rank nature of pre-trained LLMs and the lack of access to pre-training data. In this study, we investigate low-rank tensorized LLMs during fine-tuning and propose sparse augmented tensor networks (Saten) to enhance their performance. The proposed Saten framework enables full model compression. Experimental results demonstrate that Saten improves both accuracy and compression efficiency in tensorized language models, achieving state-of-the-art performance.
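As a rough illustration of the "sparse augmented" idea, the sketch below pairs a low-rank approximation with a sparse residual that keeps only the largest-magnitude approximation errors. Truncated SVD on a 2-D weight stands in for the tensor-train factorization, so this is an analogy to Saten rather than its implementation; all names are invented.

```python
import numpy as np

def sparse_augmented_lowrank(W, rank, sparsity=0.01):
    """Low-rank core plus sparse residual, in the spirit of Saten.

    Truncated SVD plays the role of the tensor network on this 2-D example;
    the sparse term keeps only the largest-magnitude residual entries, which
    is where high-rank pre-trained weights are approximated worst."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_lr = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    resid = W - W_lr
    k = max(1, int(sparsity * resid.size))  # budget of sparse residual entries
    thresh = np.partition(np.abs(resid).ravel(), -k)[-k]
    S = np.where(np.abs(resid) >= thresh, resid, 0.0)
    return W_lr, S

# Toy check: reconstruction error drops once the sparse term is added.
W = np.random.randn(256, 256)
W_lr, S = sparse_augmented_lowrank(W, rank=32, sparsity=0.02)
err_lr = np.linalg.norm(W - W_lr) / np.linalg.norm(W)
err_aug = np.linalg.norm(W - (W_lr + S)) / np.linalg.norm(W)
```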
- Africa > Senegal > Kolda Region > Kolda (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Asia > China (0.04)